43 research outputs found

    LEEC: A Legal Element Extraction Dataset with an Extensive Domain-Specific Label System

    Full text link
    As a pivotal task in natural language processing, element extraction has gained significance in the legal domain. Extracting legal elements from judicial documents helps enhance interpretative and analytical capacities of legal cases, and thereby facilitating a wide array of downstream applications in various domains of law. Yet existing element extraction datasets are limited by their restricted access to legal knowledge and insufficient coverage of labels. To address this shortfall, we introduce a more comprehensive, large-scale criminal element extraction dataset, comprising 15,831 judicial documents and 159 labels. This dataset was constructed through two main steps: first, designing the label system by our team of legal experts based on prior legal research which identified critical factors driving and processes generating sentencing outcomes in criminal cases; second, employing the legal knowledge to annotate judicial documents according to the label system and annotation guideline. The Legal Element ExtraCtion dataset (LEEC) represents the most extensive and domain-specific legal element extraction dataset for the Chinese legal system. Leveraging the annotated data, we employed various SOTA models that validates the applicability of LEEC for Document Event Extraction (DEE) task. The LEEC dataset is available on https://github.com/THUlawtech/LEEC

    Differential Fault Attack on KASUMI Cipher Used in GSM Telephony

    Get PDF
    The confidentiality of GSM cellular telephony depends on the security of A5 family of cryptosystems. As an algorithm in this family survived from cryptanalysis, A5/3 is based on the block cipher KASUMI. This paper describes a novel differential fault attack on KAUSMI with a 64-bit key. Taking advantage of some mathematical observations on the FL, FO functions, and key schedule, only one 16-bit word fault is required to recover all information of the 64-bit key. The time complexity is only 232 encryptions. We have practically simulated the attack on a PC which takes only a few minutes to recover all the key bits. The simulation also experimentally verifies the correctness and complexity

    A Comprehensive Survey on Deep Learning Techniques in Educational Data Mining

    Full text link
    Educational Data Mining (EDM) has emerged as a vital field of research, which harnesses the power of computational techniques to analyze educational data. With the increasing complexity and diversity of educational data, Deep Learning techniques have shown significant advantages in addressing the challenges associated with analyzing and modeling this data. This survey aims to systematically review the state-of-the-art in EDM with Deep Learning. We begin by providing a brief introduction to EDM and Deep Learning, highlighting their relevance in the context of modern education. Next, we present a detailed review of Deep Learning techniques applied in four typical educational scenarios, including knowledge tracing, undesirable student detecting, performance prediction, and personalized recommendation. Furthermore, a comprehensive overview of public datasets and processing tools for EDM is provided. Finally, we point out emerging trends and future directions in this research area.Comment: 21 pages, 5 figure

    Linear Regression Side Channel Attack Applied on Constant XOR

    Get PDF
    Linear regression side channel attack (LRA) used to be known as a robust attacking method as it makes use of independent bits leakage. This leakage assumption is more general than Hamming weight/ Hamming distance model used in correlation power attack (CPA). However, in practice, Hamming weight and Hamming distance model suit most devices well. In this paper, we restudy linear regression attack under Hamming weight/ Hamming distance model and propose our novel LRA methods. We find that in many common scenarios LRA is not only an alternative but also a more efficient tool compared with CPA. Two typical cases are recovering keys with XOR operation leakage and chosen plaintext attack on block ciphers with leakages from round output. Simulation results are given to compare with traditional CPA in both cases. Our LRA method achieves up to 400% and 300% improvements for corresponding case compared with CPA respectively. Experiments with AES on SAKURA-G board also prove the efficiency of our methods in practice where 128 key bits are recovered with 1500 traces using XOR operation leakage and one key byte is recovered with only 50 chosen-plaintext traces in the other case

    Two Improved Multiple-Differential Collision Attacks

    Get PDF
    In CHES 2008, Bogdanov proposed multiple-differential collision attacks which could be applied to the power analysis attacks on practical cryptographic systems. However, due to the effect of countermeasures on FPGA, there are some difficulties during the collision detection, such as local high noise and the lack of sampling points. In this paper, keypoints voting test is proposed for solving these problems, which can increase the success ratio from 35% to 95% on the example of one implementation. Furthermore, we improve the ternary voting test of Bogdanov, which can improve the experiment efficiency markedly. Our experiments show that the number of power traces required in our attack is only a quarter of the requirement of traditional attack. Finally, some alternative countermeasures against our attacks are discussed

    Cryptanalysis of GOST R hash function

    No full text
    International audienceGOST R 34.11-2012 is the new Russian hash function standard. This paper presents some cryptanalytic results on GOST R. Using the rebound attack technique, we achieve collision attacks on the reduced round compression function. Result on up to 9.5 rounds is proposed, the time complexity is 2176 and the memory requirement is 2128 bytes. Based on the 9.5-round collision result, a limited birthday distinguisher is presented. More over, a k-collision on 512-bit version of GOST R is constructed which shows the weakness of the structure used in GOST R

    Pipelined XPath Query Based on Cost Optimization

    No full text
    XPath query is the key part of XML data processing, and its performance is usually critical for XML applications. In the process of XPath query, there is inherent seriality between query steps, which makes it difficult to parallelize the query effectively as a whole. On the other hand, although XPath query has the characteristics of data stream processing and is suitable for pipeline processing, the data flow of each query step usually varies a lot, which results in limited performance under multithreading conditions. In this paper, we propose a pipelined XPath query method (PXQ) based on cost optimization. This method uses pipelined query primitives to process query steps based on relation index. During pipeline construction, a cost estimation model based on XML statistics is proposed to estimate the cost of the query primitive and provide guidance for the creation of a pipeline phase through the partition of query primitive sequence. The pipeline construction technique makes full use of available worker threads and optimizes the load balance between pipeline stages. The experimental results show that our method can adapt to the multithreaded environment and stream processing scenarios of XPath query, and its performance is better than the existing typical query methods based on data parallelism

    Fault Rate Analysis: Breaking Masked AES Hardware Implementations Efficiently

    No full text
    International audienceIn 2011, Li presented clockwise collision analysis on nonprotected Advanced Encryption Standard (AES) hardware implementation. In this brief, we first propose a new clockwise collision attack, called fault rate analysis (FRA), on masked AES. Then, we analyze the critical and noncritical paths of the S-box and find that, for its three input bytes, namely, the input value, the input mask, and the output mask, the path relating to the output mask is much shorter than those relating to the other two inputs. Therefore, some sophisticated glitch cycles can be chosen such that the values in the critical path of the whole S-box are destroyed but this short path is not affected. As a result, the output mask does not offer protection to the S-box, which leads to a more efficient attack. Compared with three attacks on masking countermeasures at the Workshop on Cryptographic Hardware and Embedded Systems 2010 and 2011, our method only costs about 8% of their time and 4% of their storage space
    corecore